Abstract: The paper has two objectives. The first is to study
rigorously the transient behavior of some peer-to-peer (P2P) 
networks whenever information is replicated and disseminated 
according to epidemic-like dynamics. The second is to use the
insight gained from the previous analysis in order to predict
how efficient measures taken against P2P networks are. We first
introduce a stochastic model which extends a classical epidemic 
model, and characterize the P2P swarm behavior in presence 
of free riding peers. We then study a second model in which a 
peer initiates a contact with another peer chosen randomly. In 
both cases the network is shown to exhibit phase transitions: a 
small change in the parameters causes a large change in the 
behavior of the network. We show, in particular, how phase
transitions affect the measures that content providers take against
P2P networks distributing non-authorized music or books, and
how efficient counter-measures are.

I. Introduction 

Along with the worldwide penetration of the Internet, a
huge demand has appeared for copyrighted music and movies
that have been accessible for free over the Internet. While
this benefits a very large community of Internet users (Internauts)
and potentially provides higher benefits for Internet access providers,
it seems unclear whether the creators and the copyright owners
have gained anything from this unregulated access. Two
opposing approaches have appeared, both proposing to protect the
copyright owners. The first consists of fighting against
non-authorized access, whereas the second aims at finding
cooperative solutions that would benefit both the Internauts as
well as all other economic actors. An example of a cooperative
solution is a flat taxation that would allow Internauts to continue
downloading music and films freely, and that would distribute
the tax money among the copyright owners. This cooperative
approach has several difficulties in its implementation; a major 
one is how to distribute the tax income fairly. A major 
drawback of the confrontation policy is the huge monitoring 
effort that it requires and that seems not to provide credible 
evidence for unauthorized downloads [1]. In order to assess the 
efficiency of non-cooperative measures against unauthorized 
downloads, the authors of [2] have analyzed the impact of 
the effort, of the authorities or of content provider companies, 
invested in (i) reducing file uploading in P2P networks and in 
(ii) reducing the demand for files, on the availability of files 
and, more generally, on the operation of the P2P networks. 
The stationary analysis there is based on an M/G/∞ queueing
model.

In this paper we are interested in predicting the impact
of measures as described in the previous paragraph, on the
transient behavior of torrents. By how much should the request 
or departure rate in a P2P network be reduced in order to 
have a significant change in file availability? To achieve that, 
we consider abstract models of a torrent in simplified P2P 
networks, where a large number of peers are interested in 
a file which is initially available at a small fraction of the 
population. 

Our models are formulated as epidemic type processes 
of file dissemination. We consider both cooperative peers, 
which are those that make a file available to other peers as 
soon as they obtain the file, and free riders, who leave the 
system immediately after obtaining the file. To understand the 
impact of measures against the cooperative sharing behavior, 
we parameterize the degree of free-riding in the system as well 
as the degree of cooperation. 

The P2P dynamics is modeled by a Markov chain (Section II),
which is approximated in two specific regimes. The first
(Section III) is the early stage, when a large fraction of the
population does not yet have the file. The system is then well
approximated by a branching process. If there is
a positive probability of not becoming extinct in the first regime,
the system is shown in Section IV to move with some non-zero
probability to a second regime in which, for
a sufficiently large population size, its dynamics is close to
the solution of a differential equation. A similar fluid limit is
studied in Section VI for the case of a limitation on uplink or downlink
speed. We briefly state our contributions:

1. Modeling and approximating the transient behavior 
Our first important contribution is to show in what sense each 
of the above two models approximates the original Markov 
chain, and how to use both in order to get the whole transient 
behavior of the P2P network. This is in contrast with all other 
models of P2P networks that we know of, which either use 
only a branching process approach [14] or which use only an 
epidemic mean-field approximation [11]. The latter approach 
(of using only the mean-field limit) is shown to provide a 
tight approximation when the initial number of peers with the 
file scales linearly with the total size N of the population of 
peers. With a fixed initial number of nodes that does not scale 
with N, there is a positive probability of early extinction (see 
Section IVIII for detail) for any set of system parameters, and 
this probability cannot be predicted by the mean-field limit 
alone. 

2. Analysis and identifying phase transitions. We first
study a P2P model that corresponds to the epidemic-like file
dissemination (Sections III-IV). We then study a second model
(Section VI) in which, at random times, each peer contacts
another peer randomly chosen within the set of existing peers.
In both cases, we show the existence of phase transitions: a
small change in the parameters causes a large change in the
network behavior.

A phase transition occurs both in the branching model for 
the extinction time and in the epidemic model for the file 
availability. In the branching process, the existence of two 
phases was not known to Galton and Watson (considered as the 
founders of branching processes) and was only discovered and 
proved later in [3]. In the epidemiology community, the phase
transition was already known in [4] for a model equivalent
to our first model without the free riders. For the second
model [5], we show the existence of two phase transitions, 
one for the file availability and the other one for the maximum 
torrent size. 

3. Application. In Section [V] we present a counteraction 
against unauthorized file sharing in the presence of illegal pub- 
lishers. We evaluate the impact of measures against Internet 
piracy on the performance of P2P systems in Section rVHl (see 
Figure fTTt . 

The accuracy of the various approximations is investigated
in Section VII, related studies are discussed in Section VIII,
and concluding remarks are given in Section IX.

II. Model 

A. Assumptions 

Assume there is a population of $N$ peers interested in a
single file. Let $Y(t)$ be the number of peers that possess the file
at time $t$. A peer acquires the file when it encounters another
peer that has the file. We will consider two types of peers:
cooperative and non-cooperative peers. Once a cooperative
peer has acquired the file, it stays in the network for a random
time distributed according to an exponential rv with parameter
$\mu > 0$ and then leaves the network. During the lingering
time of a cooperative peer with the file, it participates in the
file dissemination. A non-cooperative peer, also called a free- 
rider, leaves the network at once when it receives the file. Note 
that "free riders" in our context is an abstract description of 
noncooperative behaviors, which is different from that in the 
current BitTorrent system. 

Let $X_c(t)$ and $X_f(t)$ denote the number of cooperative
peers without the file and the number of free-riders
(necessarily without the file) at time $t$, respectively. Define
the process $\mathbf{Y} := \{(Y(t), X_c(t), X_f(t)), t \ge 0\}$. Let
$(Y(0), X_c(0), X_f(0))$ denote the initial state of $\mathbf{Y}$, which satisfies
$Y(0) + X_c(0) + X_f(0) = N$. Let the ratios of the various
types of peers be $(y_0, x_{c,0}, x_{f,0}) := (Y(0)/N, X_c(0)/N, X_f(0)/N)$.
For simplicity, we introduce the new variables $N_c = X_c(0)$ and
$N_f = X_f(0)$.

We consider an abstract P2P network in which the file
acquisition is via random contact between pairs of peers.
When a peer without the file meets a cooperative peer with the
file, the cooperative peer transmits the file to the other peer.
It is assumed that it takes an exponential
time with rate $\lambda > 0$ for a peer without a file to encounter
a cooperative peer with the file. The transmission of the file
is always supposed to be successful. This model describes a
general P2P swarm without a tracker, and even the spreading
of a file in the current Internet. It is inspired by the contact process
in [5] and [16]. One of the main differences is that a peer
contacts all other connected peers in the system, instead of
only one random peer periodically. We assume that the file
transmission time is negligible compared to the time it takes
for two peers to meet and therefore this time is taken to be
zero.

All the random variables (rvs) introduced so far are assumed 
to be mutually independent. As a consequence, if Y(t) = k 
then any peer without the file will meet a cooperative peer 
with the file after a time that is distributed according to the 
minimum of $k$ independent exponential rvs with rate $\lambda$,
that is, after a time distributed according to an exponential rv
with rate $\lambda k$.

Measures of the authorities or of content provider companies
against file sharing systems may decrease the
population $N$ interested in the file and
increase the fraction of free riders among that population.
They can, however, also affect
the behavior of cooperative peers, who would leave the system
sooner (i.e. $\mu$ is expected to increase). Our model combines
an epidemic type propagation of the file together with a
description of the free riding behavior. Define $\rho := \lambda N_c/\mu$.

We first consider (Section II-B) the case where all peers
are fully cooperative in the sense that $\mu = 0$ and $X_f(0) = 0$
(no free riders); $\mu = 0$ implies that cooperative peers do not
leave the network after receiving the file. We then move to the
general case where $\mu > 0$ and $X_f(0) > 0$ (Section II-C).

B. Fully cooperative network 

When all peers are fully cooperative (i.e. $\mu = 0$ and
$X_f(0) = 0$) the population of peers remains constant and equal
to $N$, that is, $Y(t) + X_c(t) = N$ at any time $t$. The network
dynamics can be represented by the process $\{Y(t), t \ge 0\}$.

This is a finite-state continuous-time Markov process with
non-zero transitions given by

$$Y(t) \to Y(t) + 1 \quad \text{with rate } \lambda Y(t)(N - Y(t)). \qquad (1)$$

In other words the process $\{Y(t), t \ge 0\}$ is a pure birth
Markov process on the state-space $\{y_0, \ldots, N\}$, where state
$N$ is an absorbing state which is reached when all peers have
the file.

Define $m(t) := E[Y(t)]$, the expected number of peers with
the file at time $t$. Standard algebra shows that

$$\frac{dm(t)}{dt} = \lambda E[Y(t)(N - Y(t))], \quad t > 0. \qquad (2)$$

Unfortunately, the right-hand side of (2) cannot be expressed as a
function of $m(t)$, thereby ruling out the possibility of finding
$m(t)$ in closed-form as the solution of an ODE.

Assume that $\lambda$ is written as $\lambda = \beta/N$ and that
$\lim_{N\to\infty} N^{-1} Y(0) = y_0 \in (0, 1]$. Then, for large $N$, $m(t)$
is well-approximated by $N y(t)$ where $y(t)$ is obtained as the
unique solution of the ODE [6, Thm 3.1]

$$\frac{dy(t)}{dt} = f(y(t)), \quad t > 0, \qquad (3)$$

where $f(u) := \beta u(1 - u)$ and $y(0) = y_0 \in (0, 1]$ (conditions
(3.2)-(3.4) in [6, Thm 3.1] are clearly satisfied). It is found
that

$$y(t) = \frac{y_0}{y_0 + (1 - y_0)\exp(-\beta t)}. \qquad (4)$$

This is a well-known instance (see e.g. [12]) of what is
known as mean-field approximation, a theory that focuses on
the solution of ODEs obtained as limits of jump Markov
processes [6]. The ODE (3) has been extensively used in
epidemiology studies, where $y(t)$ represents the fraction of
infected patients at time $t$ when the population is of size $N$.
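
As a quick numerical illustration (not part of the original analysis), the closed-form solution of the logistic ODE $dy/dt = \beta y(1 - y)$ can be cross-checked against a direct integration. The sketch below is in Python; the function names, the step count and the parameter values are illustrative assumptions.

```python
import math


def logistic_fraction(t, y0, beta):
    """Closed-form solution of dy/dt = beta*y*(1-y) with y(0) = y0."""
    return y0 / (y0 + (1.0 - y0) * math.exp(-beta * t))


def euler_fraction(t_end, y0, beta, steps=100_000):
    """Forward-Euler integration of the same ODE, used only as a cross-check."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        y += h * beta * y * (1.0 - y)
    return y


if __name__ == "__main__":
    y0, beta = 0.05, 1.5  # illustrative values, not taken from the paper
    for t in (1.0, 3.0, 6.0):
        print(t, logistic_fraction(t, y0, beta), euler_fraction(t, y0, beta))
```

Multiplying `logistic_fraction` by $N$ gives the mean-field estimate of the number of peers holding the file in the fully cooperative case.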

Proposition 1 below, whose proof can be found in [22],
states that the mean-field approximation is an upper bound
for $E[Y(t)]$.

Proposition 1: $E[Y(t)] \le N y_0 / (y_0 + (1 - y_0) e^{-\beta t})$ for all $t \ge 0$.

C. General network



We consider the general network defined in Section II-A.
Define the vector $X(t) := (X_c(t), X_f(t))$ where we recall that
$X_c(t)$ is the number of cooperative nodes in the system who
do not have the file at time $t$ and $X_f(t)$ is the number of
free-riders in the system at time $t$ (by definition, none of these have
the file at time $t$). Let $e_c = (1, 0)$ and $e_f = (0, 1)$. Under the
statistical assumptions made in Section II-A it is seen that the
process $\mathbf{Y} = \{(Y(t), X(t)), t \ge 0\}$ is a finite-state Markov
process whose non-zero transitions are given by



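$$(Y(t), X(t)) \to (Y(t) + 1, X(t) - e_c) \quad \text{with rate } \lambda Y(t) X_c(t), \qquad (5)$$

$$(Y(t), X(t)) \to (Y(t) - 1, X(t)) \quad \text{with rate } \mu Y(t), \qquad (6)$$

$$(Y(t), X(t)) \to (Y(t), X(t) - e_f) \quad \text{with rate } \lambda Y(t) X_f(t). \qquad (7)$$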



Throughout this paper we will assume that $\lambda > 0$ and $\mu > 0$.

The process $\mathbf{Y}$ takes its values in the set $\mathcal{E} := \{(i, j, k) : 0 \le i \le y_0 + N_c,\ 0 \le j \le N_c,\ 0 \le j + k \le N - y_0\}$. Furthermore,
all states in $\mathcal{E}$ of the form $(0, j, k)$ are absorbing states since
there are no more transitions when the file has disappeared.
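
The transitions (5)-(7) can be simulated directly with a Gillespie-type (event-driven) scheme. The following Python sketch is only an illustration of the model, not code from the paper; the function name `simulate_swarm`, the returned tuple and the parameter values in the usage lines are assumptions.

```python
import random


def simulate_swarm(y0, n_c, n_f, lam, mu, rng):
    """Simulate the chain (Y, X_c, X_f) until the file disappears (Y = 0).

    Returns (absorption time, cooperative peers left without the file,
    free riders left without the file, maximum number of copies seen).
    """
    y, xc, xf = y0, n_c, n_f
    y_max, t = y, 0.0
    while y > 0:
        r_coop = lam * y * xc   # rate of transition (5)
        r_leave = mu * y        # rate of transition (6)
        r_free = lam * y * xf   # rate of transition (7)
        total = r_coop + r_leave + r_free
        t += rng.expovariate(total)      # exponential holding time
        u = rng.random() * total         # pick the next event
        if u < r_coop:
            y, xc = y + 1, xc - 1        # a cooperative peer gets the file
        elif u < r_coop + r_leave or xf == 0:
            y -= 1                       # a peer with the file departs
        else:
            xf -= 1                      # a free rider gets the file and leaves
        y_max = max(y_max, y)
    return t, xc, xf, y_max


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative values: N = 100 peers, 5 initial copies, beta = 1, mu = 0.2.
    print(simulate_swarm(5, 60, 35, 1.0 / 100, 0.2, rng))
```

Averaging the returned quantities over many runs gives empirical estimates of the extinction time, of the peers that never receive the file and of the maximum torrent size.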

An explicit characterization of the transient behavior of the
absorbing Markov process $\mathbf{Y}$ is a difficult task due both to
the presence of non-linear and non-homogeneous transition
rates in the state variables and to the dimension of $\mathbf{Y}$. In
this paper we will instead develop two approximations of the
Markov process $\mathbf{Y}$. The first one, in Section III, will consist
in replacing $X_c(t)$ by $N_c = X_c(0)$ in the transition rate
(5), which will introduce a birth and death Markov branching
process. As expected, this (so-called) branching approximation
will lose its accuracy as the ratio $X_c(t)/N_c$ decreases.

The second approximation, in Section IV, will use an
asymptotic argument as $N \to \infty$ based on a mean-field
approximation of $\mathbf{Y}$. This approximation is justified if the
initial state of $\mathbf{Y}$ is of the order of $N$. Both the branching
and the mean-field approximations will allow us to
approximate key characteristics of $\mathbf{Y}$ such as the probability
of disappearance of the file, the time before all files disappear,
the maximum number of cooperative peers in the network and
the fraction of peers that eventually receive the file.



III. Branching approximation 

Let $\mathbf{Y}_b := \{Y_b(t), t \ge 0\}$ be a Markov process on $\mathbb{N} :=
\{0, 1, \ldots\}$ (the subscript $b$ refers to "branching") with non-zero
transition rates given by

$$Y_b(t) \to Y_b(t) + 1 \quad \text{with rate } \lambda Y_b(t) N_c, \qquad (8)$$

$$Y_b(t) \to Y_b(t) - 1 \quad \text{with rate } \mu Y_b(t), \qquad (9)$$

where we recall that $N_c$ is the number of cooperative peers
without the file at time $t = 0$.

Since $X_c(t)$, the number of cooperative peers without the
file at time $t$, is non-increasing in $t$, a quick comparison
between (5)-(7) and (8)-(9) indicates that the process $\mathbf{Y}_b$ should
dominate the process $\mathbf{Y}$. This bounding result is formalized
and proved in the proposition below.

A word on the notation: a real-valued rv $Z_1$ is stochastically
smaller than another real-valued rv $Z_2$, denoted as $Z_1 \le_{st} Z_2$,
if $P(Z_1 > x) \le P(Z_2 > x)$ for all $x$.

Proposition 2: If $Y(0) \le Y_b(0)$ then $Y(t) \le_{st} Y_b(t)$ for
any $t \ge 0$.

The Markov process $\mathbf{Y}_b$ is an absorbing continuous-time birth
and death process on $\mathbb{N}$ with absorbing state 0. Because
its transition rates are linear functions of the system state,
this is also a continuous-time Markov branching process [9],
namely, a process in which at any time $t$ each member of
$Y_b(t)$ evolves independently of each other. The next section
specializes known results of the theory of branching processes
to the process $\mathbf{Y}_b$.

A. Extinction probability and extinction time 

As previously observed the process $\mathbf{Y}_b$ is a birth and death
branching process [9, Chapter V]. Each object (peer) of this
process has a probability of change in the interval $(t, t + h)$
given by $uh + o(h)$ with $u = \lambda N_c + \mu$; with probability $p_0 =
\mu/u$ an object dies (a peer leaves) and with probability $p_2 =
\lambda N_c/u$ an object is replaced by two objects (a peer receives
the file).

Given $Y_b(0) = k$ the extinction time $T_b(k)$ is defined by

$$T_b(k) = \min\{t \ge 0 : Y_b(t) = 0\}.$$

Let $G_k(t) := P(T_b(k) \le t)$ be the CDF of $T_b(k)$. Given
$Y_b(0) = k$, the extinction probability is given by $q_k =
G_k(\infty)$. The CDF of $T_b(1)$ is obtained from [9, Eq. (7.3),
p. 104] and is given by

$$G_1(t) = \frac{1 - e^{-\mu(1-\rho)t}}{1 - \rho e^{-\mu(1-\rho)t}}, \quad t \ge 0, \qquad (10)$$

where we recall that $\rho = \lambda N_c/\mu$. From (10) we find

$$q_1 = \min\{1, 1/\rho\}. \qquad (11)$$

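In other words, the extinction will be certain iff $\rho \le 1$. Since
all objects behave independently of each other we have $q_k =
q_1^k = \min\{1, 1/\rho^k\}$ and

$$G_k(t) = G_1(t)^k = \left(\frac{1 - e^{-\mu(1-\rho)t}}{1 - \rho e^{-\mu(1-\rho)t}}\right)^k, \quad t \ge 0. \qquad (12)$$

In particular, if $\rho = 1$,

$$G_k(t) = \left(\frac{\mu t}{1 + \mu t}\right)^k, \quad t \ge 0. \qquad (13)$$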
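
The formulas (10)-(13) are straightforward to evaluate numerically. The Python sketch below is a hedged illustration: the function names are assumptions, and the case $\rho > 1$ is rewritten in an algebraically equivalent form (numerator and denominator of (10) multiplied by $e^{\mu(1-\rho)t}$) purely to avoid floating-point overflow for large $t$.

```python
import math


def extinction_probability(k, rho):
    """q_k = min{1, 1/rho^k}, the probability that k initial copies die out."""
    return min(1.0, rho ** (-k))


def extinction_cdf(t, k, rho, mu):
    """G_k(t) = P(T_b(k) <= t) for the branching approximation."""
    if t <= 0.0:
        return 0.0
    if rho == 1.0:
        g1 = mu * t / (1.0 + mu * t)          # critical case, eq. (13)
    else:
        a = mu * (1.0 - rho) * t
        if a >= 0.0:                          # rho < 1: eq. (10) as written
            e = math.exp(-a)
            g1 = (1.0 - e) / (1.0 - rho * e)
        else:                                 # rho > 1: same ratio, rescaled
            w = math.exp(a)
            g1 = (1.0 - w) / (rho - w)
    return g1 ** k                            # independence of lineages, eq. (12)


if __name__ == "__main__":
    for rho in (0.5, 1.0, 2.0):               # illustrative values
        print(rho, extinction_probability(3, rho), extinction_cdf(10.0, 3, rho, 1.0))
```

For $\rho > 1$ the value of `extinction_cdf` at large $t$ approaches $1/\rho^k$, which is the probability of early extinction that the mean-field limit alone cannot predict.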



B. Expected time to extinction 

Assume that $\rho < 1$ (extinction is certain). The expected
extinction time is equal to $E[T_b(k)] = \int_0^\infty (1 - G_k(t))\,dt$. In
particular

$$E[T_b(1)] = -\frac{\log(1 - \rho)}{\mu\rho}. \qquad (14)$$
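
As a sanity check of (14), the integral $\int_0^\infty (1 - G_1(t))\,dt$ can be evaluated numerically and compared with the closed form. This Python sketch is illustrative only; the truncation point of the integral, the step count and the function names are assumptions.

```python
import math


def expected_extinction_time(rho, mu):
    """Closed form (14) for E[T_b(1)], valid for 0 < rho < 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("formula (14) requires 0 < rho < 1")
    return -math.log(1.0 - rho) / (mu * rho)


def survival(t, rho, mu):
    """1 - G_1(t) from (10), for rho < 1."""
    e = math.exp(-mu * (1.0 - rho) * t)
    return (1.0 - rho) * e / (1.0 - rho * e)


def expected_extinction_time_numeric(rho, mu, steps=200_000):
    """Trapezoidal approximation of the integral of 1 - G_1 over [0, t_max]."""
    t_max = 50.0 / (mu * (1.0 - rho))   # the tail beyond t_max is negligible
    h = t_max / steps
    total = 0.5 * (survival(0.0, rho, mu) + survival(t_max, rho, mu))
    for i in range(1, steps):
        total += survival(i * h, rho, mu)
    return total * h


if __name__ == "__main__":
    rho, mu = 0.5, 1.0  # illustrative values
    print(expected_extinction_time(rho, mu),
          expected_extinction_time_numeric(rho, mu))
```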



Let us now come back to the original process $\mathbf{Y}$. Define
$T(y_0) := \inf\{t : Y(t) = 0\}$, the first time when the file has
disappeared from the network given that $Y(0) = y_0$. When
$Y(0) = Y_b(0) = y_0$, Proposition 2 implies that

$$P(T(y_0) > t) = P(Y(t) > 0) \le P(Y_b(t) > 0) = 1 - G_{y_0}(t).$$

In particular $E[T(y_0)] \le E[T_b(y_0)]$, so that $E[T(1)] \le
-\log(1 - \rho)/(\mu\rho)$ from (14) for $\rho < 1$.

IV. Mean-field approximation 

In this section we investigate the behavior of the process $\mathbf{Y}$
defined in Section II-C as $N$, the number of peers, gets large.
We first show that this behavior (to be made more precise) 
is well approximated by a deterministic limit solution of an 
ODE, an approach known as mean-field approximation. See 
[6] for the theory and [10], [16], [12] for recent applications 
in the area of file sharing systems. 

As in Section II-B we assume that the pairwise contact
rate, $\lambda$, is of the form $\lambda = \beta/N$ with $\beta > 0$. We recall that
the initial state of $\mathbf{Y}$ is given by

$$Y(0) = N y_0, \quad X_c(0) = N x_{c,0}, \quad X_f(0) = N x_{f,0} \qquad (15)$$

with $y_0 + x_{c,0} + x_{f,0} = 1$. [The analysis below holds under
the weaker condition $\lim_{N\to\infty} N^{-1}(Y(0), X_c(0), X_f(0)) =
(y_0, x_{c,0}, x_{f,0})$.]

Let $v_1 = (1, -1, 0)$, $v_2 = (-1, 0, 0)$ and $v_3 = (0, 0, -1)$.
Denote by $g(Y, Y + v_i)$, $i = 1, 2, 3$, the non-zero transition
rates of the Markov process $\mathbf{Y}$ out of state $Y =
(Y_1, Y_2, Y_3)$. We have (cf. (5)-(7))

$$g(Y, Y + v_1) = \frac{\beta}{N} Y_1 Y_2, \quad g(Y, Y + v_2) = \mu Y_1, \quad g(Y, Y + v_3) = \frac{\beta}{N} Y_1 Y_3,$$

which can be rewritten as

$$g(Y, Y + v_i) = N f\left(\frac{Y}{N}, v_i\right), \quad i = 1, 2, 3, \qquad (16)$$

where $f(u, v_1) = \beta u_1 u_2$, $f(u, v_2) = \mu u_1$ and $f(u, v_3) =
\beta u_1 u_3$ for $u = (u_1, u_2, u_3)$.

We may therefore use Theorem 3.1 in [6] (it is easily seen that
conditions (3.2)-(3.4) in [6] are satisfied) to obtain that the
rescaled process $N^{-1}\mathbf{Y}$ converges in probability as $N \to
\infty$, uniformly on all finite intervals $[0, T]$, to the solution
$(y, x_c, x_f)$, $0 \le y, x_c, x_f$, $y + x_c + x_f \le 1$, of the system
of ODEs

$$\frac{d}{dt}\begin{pmatrix} y \\ x_c \\ x_f \end{pmatrix} = \begin{pmatrix} y(\beta x_c - \mu) \\ -\beta y x_c \\ -\beta y x_f \end{pmatrix} \qquad (17)$$

with initial condition $(y_0, x_{c,0}, x_{f,0})$.

In particular, for any finite $t$ the solution $(y, x_c, x_f)$ of
(17) will approximate the fraction of peers with the file, the
fraction of cooperative peers without the file and the fraction
of free-riders, respectively, at time $t$. The accuracy of this
approximation will increase with $N$, the total number of peers.
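
The mean-field system (17), $dy/dt = y(\beta x_c - \mu)$, $dx_c/dt = -\beta y x_c$, $dx_f/dt = -\beta y x_f$, has no simple closed-form solution in time, but it is easy to integrate numerically. The following Python sketch uses a classical fourth-order Runge-Kutta step; the function names, the step count and the parameter values are illustrative assumptions rather than anything prescribed by the paper.

```python
def mean_field_rhs(state, beta, mu):
    """Right-hand side of the ODE system (17) for state = (y, x_c, x_f)."""
    y, xc, xf = state
    return (y * (beta * xc - mu), -beta * y * xc, -beta * y * xf)


def rk4_step(state, h, beta, mu):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = mean_field_rhs(state, beta, mu)
    k2 = mean_field_rhs(shifted(state, k1, h / 2), beta, mu)
    k3 = mean_field_rhs(shifted(state, k2, h / 2), beta, mu)
    k4 = mean_field_rhs(shifted(state, k3, h), beta, mu)
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate_mean_field(y0, xc0, xf0, beta, mu, t_end, steps=10_000):
    """Return the list of (t, y, x_c, x_f) on a uniform grid over [0, t_end]."""
    h = t_end / steps
    state = (y0, xc0, xf0)
    trajectory = [(0.0, *state)]
    for i in range(1, steps + 1):
        state = rk4_step(state, h, beta, mu)
        trajectory.append((i * h, *state))
    return trajectory


if __name__ == "__main__":
    # Illustrative values: 5% initial copies, 60% cooperative, 35% free riders.
    traj = integrate_mean_field(0.05, 0.6, 0.35, beta=2.0, mu=0.5, t_end=60.0)
    print("final state:", traj[-1])
    print("peak fraction with the file:", max(point[1] for point in traj))
```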



A. Peers that never receive the file: a phase transition 

The fraction of cooperative peers $x_c$ and the fraction of
free-riders $x_f$ that do not have the file monotonically decrease (this
is true also for the original system) to some limit values. They
can continue decreasing until there are no copies of the file in
the system, namely until $y = 0$.

The first question we wish to address is whether these limits
are close to 0 or are large. In other words, we wish to know
whether all (or almost all) peers interested in the file are able to
obtain it or not. If the answer is no, then we shall be interested
in computing the fraction of peers that never receive the file.

Let $\theta := \beta/\mu$. From the first two equations in (17) we obtain
$dy/dx_c$ as

$$\frac{dy}{dx_c} = -1 + \frac{1}{\theta x_c}. \qquad (18)$$

The solution of this differential equation is

$$x_c + y = \theta^{-1} \ln x_c + \phi(\theta), \qquad (19)$$

where $\phi(\theta) := x_{c,0} + y_0 - \theta^{-1} \ln x_{c,0}$. Let $y^{\max}$ be the
maximum ratio of cooperative peers with the file. According
to the first equation in (17), $y^{\max}$ is reached when $x_c = \theta^{-1}$
if $\theta > 1$ and is expressed as

$$y^{\max} = -\theta^{-1}(1 + \ln\theta) + \phi(\theta). \qquad (20)$$



When $\theta \le 1$, $y^{\max}$ is reached when $x_c = x_{c,0}$ (i.e. at time
$t = 0$). On the other hand, as $t \to \infty$, $y$ is approaching 0
(since we have assumed that $\mu > 0$) so that, from (19), $x_c(\infty)$
satisfies the equation

$$x_c(\infty) - \theta^{-1} \ln(x_c(\infty)) - \phi(\theta) = 0. \qquad (21)$$



It is easily seen that this equation has a unique solution in
$(0, x_{c,0})$ (note that $x_c(t) \le x_{c,0}$ for any $t$ since $x_c$ is
non-increasing from the second equation in (17)). From (17) we
find that $x_f(t) = \frac{x_{f,0}}{x_{c,0}} x_c(t)$ for all $t$.

As recalled earlier the mean-field approximation only holds
for finite $t$ and there is therefore no guarantee that it will
hold when $t = \infty$, namely, that $N^{-1}\mathbf{Y}(\infty)$ will converge
in probability to $(0, x_c(\infty), x_{f,0}\, x_c(\infty)/x_{c,0})$ as $N \to \infty$.
However, due to the particular structure of the infinitesimal
generator of $\mathbf{Y}$ this convergence takes place as shown in
[13, Sec. 5.2] (Hint: consider the rescaled Markov process
$\tilde{\mathbf{Y}} := \{(\tilde{Y}(t), \tilde{X}_c(t), \tilde{X}_f(t)), t \ge 0\}$ with generator $\tilde{g}(\cdot,\cdot) =
g(\cdot,\cdot)/Y_1$ and same state-space as $\mathbf{Y}$, so that starting from
the same initial condition the terminal values of $X_c(t)$ and
$\tilde{X}_c(t)$ (resp. $X_f(t)$ and $\tilde{X}_f(t)$) will have the same distribution.
The mean-field approximation for $\tilde{\mathbf{Y}}$ shows that the solution
of the associated ODEs is given by $(0, x_{c,0} e^{-\beta\tau}, x_{f,0} e^{-\beta\tau})$
for any $t \ge \tau$, with $\tau$ the unique solution in $(0, \infty)$ of
$x_{c,0} + y_0 = x_{c,0} e^{-\beta\tau} + \mu\tau$, from which the result follows).

In summary, as $N$ is large, the fraction of cooperative peers (resp.
free riders) which will never receive the file is approximated
by $x_c^{\lim} := x_c(\infty)$ (resp. $x_f^{\lim} := x_{f,0}\, x_c^{\lim}/x_{c,0}$)
where $x_c(\infty)$ can be (numerically) calculated from (21).
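
One possible way to carry out this numerical calculation is bisection on the equation $x - \theta^{-1}\ln x - \phi(\theta) = 0$ with $\phi(\theta) = x_{c,0} + y_0 - \theta^{-1}\ln x_{c,0}$. The Python sketch below is an assumption-laden illustration, not the authors' procedure: it searches in the variable $u = \ln x$ (to avoid underflow for small roots), assumes $y_0 > 0$, and uses hypothetical function names and parameter values.

```python
import math


def final_cooperative_fraction(theta, xc0, y0, iterations=200):
    """Root of x - ln(x)/theta - phi(theta) = 0 in (0, xc0), i.e. x_c(infinity)."""
    phi = xc0 + y0 - math.log(xc0) / theta

    def residual(u):
        # Left-hand side of the equation written in terms of u = ln(x).
        return math.exp(u) - u / theta - phi

    # residual(lo) = exp(lo) >= 0 and residual(hi) = -y0 < 0 bracket the root.
    lo, hi = -theta * phi, math.log(xc0)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


def final_free_rider_fraction(theta, xc0, xf0, y0):
    """Free riders never served, using x_f / x_c = x_f0 / x_c0 along the flow."""
    return xf0 * final_cooperative_fraction(theta, xc0, y0) / xc0


if __name__ == "__main__":
    xc0, y0 = 0.6, 0.05  # illustrative initial fractions
    for product in (0.5, 0.9, 1.1, 2.0, 4.0):   # values of theta * xc0
        theta = product / xc0
        print(product, final_cooperative_fraction(theta, xc0, y0))
```

Running the loop for values of $\theta x_{c,0}$ on both sides of 1 shows how quickly the unserved fraction drops once the product exceeds 1.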

We are interested in whether there is an abrupt change
in content availability (i.e. $x_c(\infty)$) with the parameter $\theta$.
Obviously, if $\theta$ is 0, all the cooperative peers that do not
have the file at time 0 will never receive it. To find a phase
transition, we approximate $\log(x_c(\infty))$ in (21) by using its
Taylor expansion at $x_{c,0}$ and obtain

$$x_c(\infty) \approx \frac{\left(\frac{1}{\theta} - x_{c,0} - y_0\right) + \frac{1}{2\theta}\left(\frac{x_c(\infty)}{x_{c,0}} - 1\right)^2}{\frac{1}{\theta x_{c,0}} - 1}.$$

Since the expression $\frac{1}{2\theta}\left(\frac{x_c(\infty)}{x_{c,0}} - 1\right)^2$ is bounded, the phase
transition happens at $\theta^* = 1/x_{c,0}$.

Despite the similarity in the definitions of $\rho$ in Section III
and of $\theta x_{c,0}$ in the present section, the phase transition at
$\rho = 1$ is different in nature from that at $\theta x_{c,0} = 1$. The
former indicates whether or not the file will be extinct while
the latter will drastically impact the final size of the torrent.

Figure 1 displays the mapping $\log_{10}(\theta x_{c,0}) \to x_c^{\lim}$ for
$x_{c,0} \in \{0.01, 0.1, 0.3, 0.5, 0.9\}$ and $y_0 = 0.05$. The curves
for $x_c^{\lim}$ are monotonically decreasing in $x_{c,0}$ (the curve that
intersects the vertical axis close to 1 is the one corresponding
to $x_{c,0} = 0.01$, and so on). For each curve we note the
existence of a phase transition at $\theta x_{c,0} = 1$, which is more
pronounced as the ratio of free riders increases.




Fig. 1. Ratio of cooperative peers (as $N$ is large) that never receive the file
as a function of $\log_{10}(\theta x_{c,0})$, with $y_0 = 0.05$.

B. Combining the branching and the epidemic model 

The mean-field approximation is accurate for large $N$ if
the initial state scales with $N$ linearly. In the case that $N$ is
very large but the initial condition does not scale with $N$ (e.g.
$Y(0) = 1$, $X_c(0) + X_f(0) = N - 1$), we can do the following.
Fix some $N_0$ much smaller than $N$ but larger than 1. Use the
branching process approximation until the number of peers
with the file is $N_0$. Then, switch to the epidemic model. (For
the branching process, we recall that given that there is no
extinction, the population size grows exponentially fast).
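
The two-stage procedure just described can be sketched as follows in Python. This is a hypothetical illustration, not the authors' implementation: the branching process (8)-(9) is simulated until it hits 0 or $N_0$, and the mean-field ODE (17) is then integrated with a simple forward-Euler scheme. The state handed to the ODE at the switch is an assumption made here: each birth is counted as one cooperative peer served, and the free-rider fraction is left at its initial value.

```python
import random


def branching_until(y_start, n0, lam, n_c, mu, rng):
    """Run the branching process (8)-(9) until it reaches 0 or n0 copies."""
    y, t, births = y_start, 0.0, 0
    birth, death = lam * n_c, mu            # per-copy birth and death rates
    p_birth = birth / (birth + death)
    while 0 < y < n0:
        t += rng.expovariate(y * (birth + death))
        if rng.random() < p_birth:
            y, births = y + 1, births + 1
        else:
            y -= 1
    return y, t, births


def mean_field_tail(y, xc, xf, beta, mu, t_end, steps=20_000):
    """Forward-Euler integration of the ODE system (17) over [0, t_end]."""
    h = t_end / steps
    for _ in range(steps):
        dy, dxc, dxf = y * (beta * xc - mu), -beta * y * xc, -beta * y * xf
        y, xc, xf = y + h * dy, xc + h * dxc, xf + h * dxf
    return y, xc, xf


def hybrid_run(n, n_c, n_f, beta, mu, n0, t_end, rng, y_start=1):
    """Branching phase followed, if the file survives, by the mean-field phase."""
    y, t_switch, births = branching_until(y_start, n0, beta / n, n_c, mu, rng)
    if y == 0:
        return {"extinct": True, "t_switch": t_switch}
    # Assumed hand-over state (see the lead-in): births consume cooperative peers.
    xc0 = max(n_c - births, 0) / n
    y_end, xc_end, xf_end = mean_field_tail(y / n, xc0, n_f / n, beta, mu, t_end)
    return {"extinct": False, "t_switch": t_switch,
            "y": y_end, "x_c": xc_end, "x_f": xf_end}


if __name__ == "__main__":
    rng = random.Random(3)
    print(hybrid_run(10_000, 6_000, 3_999, beta=2.0, mu=0.5,
                     n0=50, t_end=60.0, rng=rng))
```

Repeating `hybrid_run` over many seeds gives both an empirical early-extinction frequency and, for the surviving runs, the mean-field estimate of the final state.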

V. Control actions against P2P networks 

In this section, we first investigate the major findings in the 
analysis of content availability. A set of control actions are 
proposed to protect copyrighted files against P2P file sharing. 

A. Observations on file availability 

Before proposing the counteractions against illegal P2P 
swarms, we investigate the impact of measures on file avail- 
ability. The main question is how does a decrease or increase 
in one of the system parameters affect measures such as 

• the size of the torrent: the fraction of those who are 
interested in the file and are able to get a copy of it. 
This can be seen as a global availability measure. 

• the extinction probability or the expected extinction time, 

• the maximum availability: the maximum number of 
copies that can be found simultaneously in the system. 
This can be viewed as an instantaneous availability mea- 
sure. 



According to the analysis in Sections III-IV, all the above measures
depend on the ratio $\lambda/\mu$ (or, equivalently, $\beta/\mu$). A small ratio
$\lambda/\mu$ means a poor availability of the file. However, the contact
rate $\lambda$ is an intrinsic parameter of P2P swarms that can hardly
be changed technically. An even more challenging problem is
that there usually exist several illegal publishers residing in the
system for a very long time. They aim to spread the copyright
protected file as widely as possible in the P2P swarm. To combat
undesirable file sharing, we present two methods, the
cooperation control and the pollution attack. The former
discourages the cooperation of peers with the file.
The latter introduces a number of polluters before the file
dissemination begins; due to the page limit of this paper,
its description can be found in [22].

B. Control of cooperation 

We introduce the cooperation control to prevent the
dissemination of copyrighted files. We aim to reduce the degree of
cooperation (i.e. increasing $\mu$) so that the delay of obtaining
the file is increased. To achieve this goal, the content owner
can invest a certain amount of money in the very beginning 
to discourage the cooperation of peers. The cooperation con- 
trol does not contradict with our opposition of collaborative 
solutions such as flat tax. In fact, we are focusing on this 
unilateral action of the content owner against unauthorized 
file dissemination. 

We consider the same model as in Section II-C but we
now assume that all peers are cooperative and that there is a
number $Y_N^* > 0$ of permanent publishers, where the subscript
$N$ refers to the total number of peers in the system at time
$t = 0$. The pairwise contact rate is $\lambda = \beta/N$. Denote by $a$ the
investment level of the content owner against P2P networks.
The departure rate is an increasing function of $a$, denoted
by $\mu(a)$. We denote by $Y_N(t)$ the number of non-permanent
publishers and by $X_N(t)$ the number of peers without the
file at time $t$. Observe that $Y_N(0) + X_N(0) = N - Y_N^*$.
If $\lim_{N\to\infty} Y_N(0)/N = y_0$ and $\lim_{N\to\infty} X_N(0)/N = x_0$,
which implies that $\lim_{N\to\infty} Y_N^*/N = 1 - x_0 - y_0 := y^*$, then,
by Kurtz's result [6], the rescaled process $N^{-1}(Y_N(t), X_N(t))$
converges in probability as $N \to \infty$, uniformly on all finite
intervals $[0, T]$, to the solution of the ODEs

$$\frac{d}{dt}\begin{pmatrix} y \\ x \end{pmatrix} = \begin{pmatrix} \beta(y + y^*)x - \mu(a)y \\ -\beta(y + y^*)x \end{pmatrix} \qquad (22)$$

with initial state $(y_0, x_0)$. From now on we will assume that
$y^* > 0$.

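Consider an arbitrary peer without the file at time $t = 0$
and denote by $T_N$ the time that elapses before it receives it.
Let $P_N(t) := P(T_N < t)$. Similarly to [12, Page 6] we find

$$\frac{dP_N(t)}{dt} = (1 - P_N(t))\,\beta\,\frac{E[Y_N(t)] + Y_N^*}{N}. \qquad (23)$$

Solving for $P_N(t)$ gives

$$P_N(t) = 1 - \exp\left(-\beta \int_0^t \frac{E[Y_N(s)] + Y_N^*}{N}\,ds\right), \quad t \ge 0. \qquad (24)$$

Hence

$$E[T_N] = \int_0^\infty (1 - P_N(t))\,dt = \int_0^\infty \exp\left(-\beta \int_0^t \frac{E[Y_N(s)] + Y_N^*}{N}\,ds\right) dt.$$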



From the above we know that $(E[Y_N(t)] + Y_N^*)/N \to y(t) + y^*$
as $N \to \infty$ for every $t \ge 0$, so that from (24)

$$\lim_{N\to\infty} (1 - P_N(t)) = e^{-\beta \int_0^t (y(s) + y^*)\,ds} \qquad (25)$$

for every $t \ge 0$. On the other hand, $\lim_{N\to\infty} Y_N^*/N = y^*$
implies that for $0 < \epsilon < y^*$ there exists $N_\epsilon$ such that $Y_N^*/N >
y^* - \epsilon$ for all $N > N_\epsilon$. Therefore, from (24),

$$1 - P_N(t) \le e^{-\beta(y^* - \epsilon)t} \qquad (26)$$

for $N > N_\epsilon$, $t \ge 0$. Since the r.h.s. of (26) is integrable
in $[0, \infty)$, (25) and (26) allow us to apply the bounded
convergence theorem to conclude that

$$T(a) := \lim_{N\to\infty} \int_0^\infty (1 - P_N(t))\,dt = \int_0^\infty e^{-\beta \int_0^t (y(s) + y^*)\,ds}\,dt.$$

The objective of the content owner is to choose an investment
level $a \ge 0$ which will maximize its utility

$$h(a) := T(a) - a. \qquad (27)$$

To understand the impact of cooperation control on the delay,
we present numerical studies in Section VII.
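
To make the optimization concrete, the limiting delay $T(a) = \int_0^\infty e^{-\beta\int_0^t (y(s)+y^*)\,ds}\,dt$ can be approximated by integrating the ODE (22) and accumulating the two integrals along the way. The Python sketch below is a rough illustration under stated assumptions: the linear departure-rate function `mu_of_a`, the forward-Euler scheme, the truncation of the time horizon and all parameter values are hypothetical choices, since the paper only requires $\mu(a)$ to be increasing.

```python
import math


def mean_delay(beta, mu, y0, x0, h=0.01, tail=40.0):
    """Approximate T for a given departure rate mu = mu(a).

    The horizon is truncated at tail / (beta * y_star), beyond which the
    integrand is below exp(-tail) because of the permanent publishers.
    """
    y_star = 1.0 - x0 - y0
    if y_star <= 0.0:
        raise ValueError("the sketch assumes y* > 0 (permanent publishers)")
    steps = int(tail / (beta * y_star) / h)
    y, x = y0, x0
    inner = 0.0        # running integral of y(s) + y* over [0, t]
    surv_prev = 1.0    # exp(-beta * inner) at the previous grid point
    total = 0.0        # running trapezoidal approximation of T
    for _ in range(steps):
        rate_prev = y + y_star
        infect = beta * (y + y_star) * x
        y, x = y + h * (infect - mu * y), x - h * infect   # Euler step of (22)
        inner += 0.5 * h * (rate_prev + y + y_star)
        surv = math.exp(-beta * inner)
        total += 0.5 * h * (surv_prev + surv)
        surv_prev = surv
    return total


def mu_of_a(a, mu0=0.1, slope=1.0):
    """Hypothetical increasing departure rate; not specified in the paper."""
    return mu0 + slope * a


def utility(a, beta, y0, x0):
    """h(a) = T(a) - a, as in (27)."""
    return mean_delay(beta, mu_of_a(a), y0, x0) - a


if __name__ == "__main__":
    beta, y0, x0 = 1.0, 0.0, 0.95   # illustrative values; y* = 0.05
    for a in (0.0, 1.0, 2.0, 5.0):
        print(a, utility(a, beta, y0, x0))
```

A coarse grid such as the one in the usage lines can be refined around the best value of $a$; since $y \ge 0$, the delay is always bounded above by $1/(\beta y^*)$, which caps what any investment can achieve.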

VI. P2P with a fixed request rate per node

A. Model

In this section we will consider a slight variation of the
model in [5]: there are $N$ peers at time $t = 0$, at least one
of them having a file. Each peer without the file sends a
request for the file to another peer selected at random. These
requests are initiated at Poisson rate $\lambda > 0$. It is assumed that
a peer with the file leaves the system after an exponentially
distributed random duration with rate $\mu > 0$. All these rvs are
mutually independent. Let $Y(t)$ (resp. $X(t)$) be the number
of peers with the file (resp. without the file) at time $t$. We
have $Y(0) + X(0) = N$ with $Y(0) \ge 1$. Under the above
assumptions $\mathbf{Z} := \{(Y(t), X(t)), t \ge 0\}$ is a Markov process
on the set $\mathcal{E} := \{(y, x) \in \{0, 1, \ldots, N\}^2 : 0 \le y + x \le N\}$.
Let $q(z, z')$, $z = (y, x)$, $z' = (y', x') \in \mathcal{E}$, denote its generator.
Non-zero transition rates are given by



$$(Y(t), X(t)) \to (Y(t) + 1, X(t) - 1) \quad \text{with rate } \frac{\lambda Y(t) X(t)}{Y(t) + X(t)}, \qquad (28)$$

$$(Y(t), X(t)) \to (Y(t) - 1, X(t)) \quad \text{with rate } \mu Y(t). \qquad (29)$$



This model differs from our previous model in that the rate
of increase is normalized by the total number of peers in
the system. More precisely, the rate in (28) follows from
the fact that with probability $Y(t)/(Y(t) + X(t))$ a peer
without the file will contact a peer with a file at time $t$, so that
the total rate of increase of the number of peers with the file
is $\lambda Y(t)X(t)/(Y(t) + X(t))$ at time $t$. This
implicitly assumes that a peer may contact itself, as
otherwise this probability would be $Y(t)/(Y(t) + X(t) - 1)$;
the reason for doing this will become apparent next. Note that
this assumption has no effect when $N$ gets large.

The same model is considered in [5] with the difference that
in [5] there is one permanent publisher, thereby implying that
all peers will receive the file. These authors show that the
mean broadcast time is $O(N)$ if $\lambda < \mu$ and is $O(\log N)$ if
$\lambda > \mu$. Thus, there is a phase transition at $\lambda = \mu$.

In this section we will instead focus on (i) the fraction of 
peers that will receive the file (in the absence of a permanent 
publisher this fraction is not always equal to 1) and on (ii) the 
maximum torrent size (maximum number of copies of the file 
at one time) as N is large. In both cases we will show the 
existence of phase transitions. 

Our analysis will use Kurtz's theorem [6, Thm 3.1] as in
Section IV. Note, however, that both metrics (i) and (ii) above
require the use of the mean-field limit as $t \to \infty$, something that
Kurtz's result does not cover.

To overcome this difficulty, we will use the same argument
as in [13] (see also Section IV where this argument was
already used), taking advantage of the particular structure of
the infinitesimal generator of the process $\mathbf{Z}$. More specifically,
it is seen that the generator of $\mathbf{Z}$ can be written in the form $q(z, z') =
y\,\tilde{q}(z, z')$ for $z = (y, x)$, $z' = (y', x') \in \mathcal{E}$, where non-zero
transition rates are given in (28)-(29).

Let Z = {(Y(t),X(t)),t > 0} be a Markov process with 
generator g(z, z') and state-space £ (same state-space as Z). 

Since Z has been obtained by changing the time-scale of 
Z, the final values of X(t) and of X(t) will have the same 
distribution (note that the final state of Y(t) and Y(t) is always 
zero since states (0, •) are all absorbing states) and so will have 
the maximum torrent size. 
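The time-change argument can be checked numerically: dividing all rates out of a state by the same positive factor changes the holding times but not the probabilities of the next jump, so both processes visit the same sequence of states. The sketch below assumes the relation between the two generators stated above (rates divided by $y$ for $y > 0$); the function names are illustrative.

```python
def rates_original(y, x, lam, mu):
    """(birth, death) rates of the original process, from (28)-(29)."""
    return lam * y * x / (y + x), mu * y

def rates_time_changed(y, x, lam, mu):
    """Rates of the time-changed process: original rates divided by y (y > 0)."""
    birth, death = rates_original(y, x, lam, mu)
    return birth / y, death / y

def birth_probability(rates):
    """Probability that the next jump is a birth (a new copy of the file)."""
    birth, death = rates
    return birth / (birth + death)

# The next-jump probabilities coincide in every state with y > 0.
for y, x in [(1, 10), (5, 3), (40, 0)]:
    p = birth_probability(rates_original(y, x, 0.5, 2.0))
    p_tilde = birth_probability(rates_time_changed(y, x, 0.5, 2.0))
    print(y, x, p, p_tilde)
```

Since the final number of peers without the file and the maximum torrent size depend only on the sequence of visited states, they are unaffected by the change of time-scale.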

Since the generator $\tilde{q}(z, z')$ can be written as $\tilde{q}(z, z') =
N f(z/N, z'/N)$ (this is where the assumption that a peer may
contact itself is useful) and since conditions (3.2)-(3.4) in [6,
Thm 3.1] are clearly satisfied, we may apply [6, Thm 3.1] to
obtain that, at any finite time $t$, $N^{-1}(\tilde{Y}(t), \tilde{X}(t))$ converges
in probability as $N \to \infty$ to the solution $(y, x)$, $0 \le y, x \le 1$,
$y + x \le 1$, of the ODEs

$$\dot{y} = -\mu + \frac{\lambda x}{x + y}, \qquad \dot{x} = -\frac{\lambda x}{x + y} \tag{30}$$

given that $\lim_{N \to \infty} N^{-1}(\tilde{Y}(0), \tilde{X}(0)) = (y(0), x(0))$. Let
$(y_0, x_0) := (y(0), x(0))$. We will assume that $0 < y_0 < 1$
and $y_0 + x_0 = 1$ (the case $y_0 = 0$ (resp. $y_0 = 1$) is of no
interest since it corresponds to a P2P network with no file at
any time (resp. where all peers have the file at time $t = 0$)).

B. Phase transitions 

Define $\theta := \lambda/\mu$. Adding both ODEs in (30) yields $y(t) +
x(t) = -\mu t + 1$. Plugging this value back into (30) gives
$x(t) = x_0 (1 - \mu t)^\theta$ for $0 \le t < 1/\mu$, and, by continuity,
$x(t) = x_0 (1 - \mu t)^\theta$ for $0 \le t \le 1/\mu$ with $x(1/\mu) = 0$.
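The closed-form solution can be cross-checked against a direct numerical integration of the ODEs (30). The sketch below uses a classical fourth-order Runge-Kutta scheme; the names `rhs`, `integrate` and `closed_form`, the step count and the parameter values are illustrative assumptions, and `theta` stands for the ratio of the contact rate to the departure rate defined above.

```python
def rhs(y, x, lam, mu):
    """Right-hand side of the ODEs (30): returns (dy/dt, dx/dt)."""
    flow = lam * x / (x + y)
    return -mu + flow, -flow

def integrate(y0, x0, lam, mu, t_end, steps=1000):
    """Fourth-order Runge-Kutta integration of (30) on [0, t_end]."""
    h = t_end / steps
    y, x = y0, x0
    for _ in range(steps):
        k1 = rhs(y, x, lam, mu)
        k2 = rhs(y + 0.5 * h * k1[0], x + 0.5 * h * k1[1], lam, mu)
        k3 = rhs(y + 0.5 * h * k2[0], x + 0.5 * h * k2[1], lam, mu)
        k4 = rhs(y + h * k3[0], x + h * k3[1], lam, mu)
        y += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        x += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return y, x

def closed_form(x0, lam, mu, t):
    """Closed-form (y(t), x(t)) for y0 + x0 = 1 and 0 <= t <= 1/mu."""
    theta = lam / mu
    x = x0 * (1 - mu * t) ** theta
    return 1 - mu * t - x, x

# Illustrative comparison at t = 0.3 with y0 = 0.2, x0 = 0.8.
print(integrate(0.2, 0.8, 0.5, 1.0, 0.3))
print(closed_form(0.8, 0.5, 1.0, 0.3))
```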

In order to approximate the fraction of peers which will
never receive the file as $N$ is large, one needs to find the first
time $\tau > 0$ where either $x(\tau) = 0$ or $y(\tau) = 0$. This time $\tau$
is easy to find as shown below.

We already know that $x(t) > 0$ for $0 \le t < 1/\mu$ and
$x(1/\mu) = 0$, so that we only need to focus on the zeros of
$y(t)$ in $[0, 1/\mu]$. By writing $y$ as $y(t) = (1 - \mu t)(1 - x_0 (1 -
\mu t)^{\theta - 1})$ we conclude that the smallest zero of $y$ in $[0, 1/\mu]$
is $(1 - x_0^{1/(1-\theta)})/\mu$ if $\theta < 1$ and is $1/\mu$ if $\theta \ge 1$. Therefore,

$$\tau = \frac{1 - x_0^{1/(1-\theta)}}{\mu} > 0 \ \text{ if } \theta < 1 \quad \text{and} \quad \tau = \frac{1}{\mu} \ \text{ if } \theta \ge 1.$$

Introducing this value of $\tau$ in $x(t)$ yields $x(\tau) = x_0^{1/(1-\theta)}$ if
$\theta < 1$ and $x(\tau) = 0$ if $\theta \ge 1$. In other words, we observe a phase
transition at $\theta = 1$: as $N$ is large, all peers will get the file if
$\theta \ge 1$, and a fraction $x_0^{1/(1-\theta)}$ of them will not if $\theta < 1$.
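The two regimes can be written as a pair of small functions. This is a sketch under the assumptions of this section ($y_0 + x_0 = 1$, with `theta` the ratio of the contact rate to the departure rate); the function names are illustrative.

```python
def absorption_time(theta, x0, mu):
    """First time at which y or x hits zero in the mean-field limit."""
    if theta < 1:
        return (1 - x0 ** (1 / (1 - theta))) / mu
    return 1 / mu

def fraction_without_file(theta, x0):
    """Limiting fraction of peers that never receive the file."""
    if theta < 1:
        return x0 ** (1 / (1 - theta))
    return 0.0

# Sweep across the phase transition at theta = 1 for x0 = 0.8.
for theta in (0.25, 0.5, 0.9, 1.0, 2.0):
    print(theta, absorption_time(theta, 0.8, 1.0), fraction_without_file(theta, 0.8))
```

Below the transition the fraction is strictly positive and shrinks as `theta` approaches one; at and above the transition it is zero.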

Let us now turn to the maximum torrent size. As $N$ is large,
the maximum torrent size divided by $N$ will be approximated by
the maximum of $y$ over the interval $[0, \tau]$. A straightforward
analysis of the mapping $t \to y(t)$ in $[0, \tau]$ shows that

• it is decreasing if $\theta \le 1$ or if $\theta > 1$ and $\theta x_0 \le 1$ (these
conditions can be merged into the single condition $\theta \le
1/x_0$), so that its maximum, $y_{\max}$, is given by $y_{\max} =
y_0 = 1 - x_0$,

• it is unimodal (first increasing then decreasing) if
$\theta > 1/x_0$, with its maximum reached at $t_1 := (1 -
(\theta x_0)^{1/(1-\theta)})/\mu$ and given by

$$y_{\max} = (\theta^\theta x_0)^{1/(1-\theta)} (\theta - 1) > 0. \tag{31}$$

In summary, as $N$ is large, the maximum torrent size is
approximated by $N y_{\max}$ with $y_{\max}$ given in (31) if $\theta > 1/x_0$
and $y_{\max} = 1 - x_0$ if $\theta \le 1/x_0$. This shows another phase
transition (see Fig. 2) at $\theta = 1/x_0$ (i.e. at $\lambda x_0 = \mu$) in the
sense that the torrent is maximum at $t = 0$ if $\theta \le 1/x_0$ and
is maximum at a later time if $\theta > 1/x_0$.
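As a sanity check, the piecewise expression for the maximum can be compared with a brute-force maximization of $y$. The sketch below substitutes $s = 1 - \mu t$, so that $y = s - x_0 s^\theta$ on $0 \le s \le 1$ and the result does not depend on $\mu$; the function names and the grid size are illustrative assumptions.

```python
def max_torrent_fraction(theta, x0):
    """Limiting maximum torrent size divided by N."""
    if theta * x0 <= 1:
        return 1 - x0            # maximum reached at t = 0
    return (theta ** theta * x0) ** (1 / (1 - theta)) * (theta - 1)

def max_torrent_fraction_grid(theta, x0, points=200001):
    """Brute-force maximum of y = s - x0 * s**theta over a grid of s in [0, 1]."""
    best = 0.0
    for i in range(points):
        s = i / (points - 1)
        best = max(best, s - x0 * s ** theta)
    return best

# The two initial conditions used in Fig. 2.
for x0 in (0.95, 0.8):
    for theta in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(x0, theta, max_torrent_fraction(theta, x0))
```

Both branches agree at the transition point, so the maximum is continuous in `theta` even though the time at which it is reached changes regime there.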




Fig. 2. Maximum torrent size over $N$ (as $N$ is large) as a function of $\theta$ for
$x_0 = 0.95$ (left figure) and $x_0 = 0.8$ (right figure).

VII. Numerical results 

This section has two goals: to investigate the accuracy of
the approximations developed in the previous sections (to be
made more precise) and to study the impact of measures
against non-authorized uploading or downloading on file
availability in P2P swarming systems. Due to lack of space,
we will not report any numerical result for the P2P model
considered in Section VI; we will instead focus on the P2P
model introduced in Section II-C and on its branching and
mean-field approximations developed in Sections III and IV,
respectively (Figs. 3-10), as well as on the optimization problem
set in Section V (Fig. 11).

For each set of parameters, between 200 and 1000 discrete-event
simulations of the Markov model in Section II-C have
been run. In each figure (except in Figs. 5-6, where only
simulation results are displayed, and in Fig. 11, where only
mean-field results are shown) both simulation and approximation
results are reported for the sake of comparison. Let
$r := (Y(0) + X_c(0))/N$ be the ratio of cooperative peers at
time $t = 0$ and recall that $N_c = X_c(0)$ (see Section II). The
total number of peers, $N$, at time $t = 0$ is equal to 400 in
Figs. 3-6, to 300 in Figs. 7-10, and to 500 in Fig. 11.
A. File extinction time and the branching approximation

In this section we focus on Figs. 3-8. Fig. 3 (resp. Fig. 4)
compares the CDF of the extinction time obtained by
simulation and by the branching approximation in (12) when
$Y(0) = 1$ (resp. $Y(0) = 3$), $\lambda = 6 \cdot 10^{-3}$, $\mu = 1$, and
for two values of $r$ ($r = 0.6$ implies that $N_c = 239$ and
$X_f(0) = 160$; $r = 1$ implies that there are no free riders
($X_f(0) = 0$) and $N_c = 399$). Note that $\rho = \lambda N_c/\mu$ is close
to 2.4 when $r = 1$ and is close to 1.43 when $r = 0.6$. In
all cases, the simulation and the branching approximation are
in close agreement up to a certain time (time $T_b$ in Fig. 3)
which, interestingly, corresponds to the extinction time in the
branching model. After this time, the probability of extinction in
the Markov model increases sharply (the larger $r$, the larger
the increase). In other words, the extinction of the file in the
original Markov model has two modes, an early extinction mode
and a late extinction mode. The former occurs when the file
disappears before the dissemination has reached its peak value
(i.e. most peers do not get the file) and the latter when most
peers leave the network with the file. One may also check
that the branching approximation provides an upper bound for
the CDF of the extinction time, as predicted by Proposition
2. Last, we note that when there are fewer cooperative peers
($r = 0.6$) the file lifetime is prolonged (see e.g. point D
in Fig. 3 where simulation curves for $r = 1$ and $r = 0.6$
cross each other); this can be explained by the fact that there
are fewer contact opportunities between cooperative peers when
$r = 0.6$. The main difference between Fig. 3 and Fig. 4 lies
in the increase of the probability of late extinction, which
is steeper with three initial seeds ($Y(0) = 3$) than with one
initial seed ($Y(0) = 1$).
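The parameter values quoted above follow from the definition of the ratio of cooperative peers and from the total of 400 peers used in these figures. A short sketch of the arithmetic (the function names are illustrative assumptions):

```python
def initial_populations(n_total, r, y0):
    """Split n_total peers into (X_c(0), X_f(0)) given r and Y(0).

    r is the ratio of cooperative peers, (Y(0) + X_c(0)) / N.
    """
    cooperative_total = round(r * n_total)    # Y(0) + X_c(0)
    x_c = cooperative_total - y0              # cooperative peers without the file
    x_f = n_total - cooperative_total         # free riders
    return x_c, x_f

def rho(lam, mu, n_c):
    """rho = lambda * N_c / mu, as quoted in the text."""
    return lam * n_c / mu

for r in (1.0, 0.6):
    n_c, x_f = initial_populations(400, r, 1)
    print(r, n_c, x_f, rho(6e-3, 1.0, n_c))
```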

Simulation results in Figs. 5-6 exhibit the same early-late
extinction pattern as in Figs. 3-4; they have been obtained for
$\lambda = 25 \cdot 10^{-4}$ and for two different values of $\mu$, $r$ and $Y(0)$.

Figs. 7-8 show the expected time to extinction as a function
of the pairwise contact rate $\lambda$, in the case of an early extinction
(i.e. for small values of $\lambda$), for $\mu = 1$ and for two values of
$r$. The curves "Model" display the mapping $\lambda \to E[T_b(k)]$,
with $E[T_b(k)]$ the expected extinction time in the branching
process given $Y(0) = k$ (see Section III). We observe an
excellent match between the simulation and the branching
approximation, thereby showing that the latter works well for
early file extinction. Also note that having three seeds instead
of one greatly extends the expected extinction time.

B. File availability and the mean-field approximation 

We now look at the fraction of peers that will not acquire the
file. We assume that $Y(0) = 10$ and we recall that $N = 300$.
Fig. 9 (resp. Fig. 10) displays this fraction as a function of $\lambda$
(resp. $\mu$) for two different values of $r$ ($r = 1$ corresponding
to $X_c(0) = 290$, and $r = 0.5$ corresponding to $X_c(0) = 140$
and $X_f(0) = 150$). In each figure, both simulation and mean-field
approximation results are reported. The fraction of peers
without the file is a decreasing function of the pairwise contact
rate $\lambda$ and an increasing function of the cooperation degree
$\mu$. The mean-field approximation is obtained as the unique
solution $x_c(\infty)$ in $(0, x_c(0))$ of the corresponding equation of
Section IV, where the initial condition of the mean-field ODEs
is given by $(y_0, x_c(0), x_f(0)) =
(Y(0)/N, X_c(0)/N, X_f(0)/N)$. In both figures we observe a
remarkable agreement between the simulation and the mean-field
results (relative errors never exceed 2% when all peers
are cooperative ($r = 1$) and never exceed 7% when half of the
peers are free riders ($r = 0.5$)). We also note that the fraction
of peers without the file considered as a function of $\lambda$ (resp.
$\mu$) is larger (resp. smaller) when $r = 0.5$ than when $r = 1$;
this is of course not surprising since, unlike cooperative peers,
free riders do not contribute to the file dissemination.

Fig. 3. CDF of extinction time for $Y(0) = 1$, $\lambda = 0.006$, $\mu = 1$.

Fig. 5. CDF of extinction time for $Y(0) = 1$ and different $\mu$.

Fig. 7. Early extinction time as a function of $\lambda$ with $Y(0) = 1$.

Fig. 8. Early extinction time as a function of $\lambda$ with $Y(0) = 3$.

Fig. 9. Fraction of peers without the file as a function of $\lambda$.

C. Action against unauthorized file downloading 

We now evaluate the impact of actions against unauthorized
file downloading. For that, we use the framework developed
in Section V. Since the simulations in [22] show that, for
large $N$, $T(a)$ in Section V is a good approximation of the
expected time, $T_N$, needed for an arbitrary peer to get the
file, we only consider the utility function $h(a)$ (see (27)). We
assume that the cooperation degree $\mu(a)$ is given by $\mu(a) =
\mu a$. There are 500 peers ($N = 500$) at time $t = 0$ including
two persistent publishers ($Y^* = 2$). We assume that $Y(0) = 0$
so that $X(0) = N - Y^* = 498$. The initial condition of the
ODE is $(y_0, x_0) = (0, X^N(0)/N)$ with $y^* = 2/N$. Fig.
11 displays the mapping $a \to h(a)$ for two different values of
$\beta$ and $\mu$. We observe that a small investment does not noticeably
increase the expected delivery delay of the file, and therefore
results in a decreased utility. As the investment grows, the utility of the
content owner increases significantly. The curves in Fig. 11
also show how large an investment has to be to counteract P2P
illegal downloading. Note that the content owner can still have an
increased utility when the ratio $\beta/\mu(a)$ is greater than one,
as the utility is maximized across all curves when the ratio
$\beta/\mu(a)$ lies between two and three.

Fig. 11. Investment vs. utility of content owner.



Fig. 10. Fraction of peers without the file as a function of $\mu$.



VIII. Related work 

There have been a number of mathematical
studies of structured and unstructured P2P-based content
distribution. A seminal work can be found in [14]. The
authors propose a continuous-time branching process to analyze
service capacity (i.e. maximum rate of downloading) and a
coarse-grain Markov model to characterize the steady state of
the downloading rate. In [11], Qiu and Srikant propose a fluid
model composed of ordinary differential equations to describe
the dynamics of BitTorrent systems. The authors of [15] further
propose a novel fluid model based on stochastic differential
equations. This new model also extends [11] to multi-class
systems and is able to describe chunk availability. Mundinger et
al. [17] propose a deterministic scheduling algorithm to achieve
the optimal makespan for a structured system, which requires
global knowledge. A coupon model is put forward in [16]
to investigate the effectiveness of a generic P2P file sharing
system.

Recently, the process of file dissemination has attracted a
lot of attention. Clevenot et al. adopt a hybrid approach (fluid
and stochastic) to analyze Squirrel, a P2P cooperative web
cache, in [18]. In [5] Queija et al. study the scaling law of
the mean broadcasting time in a closed P2P swarm with constant
request rate. The authors of [21] formulate a ball-and-urn model to
characterize the "flash crowd" effect in a closed P2P network.
The content provided by P2P networks, such as music, movies
and software, is usually unauthorized. Content providers are
therefore inclined to combat illegal downloading/uploading via
technical solutions. The authors of [2] and [19] propose an $M/G/\infty$
queueing model to assess the efficiency of non-cooperative
measures against unauthorized downloading. The authors of [20]
propose a similar but elegant queueing model to study the
impact of bundling strategy on file availability and downloading
rate. Our general model is inspired by the one in [5].
However, it differs from [5], [11], [14], [19] in four ways: 1)
we study the transient behavior; 2) a peer can initiate a
number of random contacts, instead of one, with other peers;
3) we observe several phase transitions in response to system
parameters; 4) we adopt Markov branching process and mean-field
approaches to characterize the file dissemination model
comprehensively.

IX. Conclusion 

In this paper we have proposed to use the theory of
continuous-time branching processes as well as the dynamics
of epidemics in order to study the transient behavior of
torrents that occur in P2P systems. The use of these tools
allowed us to compute the probability of early extinction of
the torrent as well as the expected time until that extinction, the
availability of a file in the system, the maximum availability
and when it occurs, and the size of the torrent. This is used for
analyzing the impact of measures to decrease non-authorized
Internet access to copyrighted files. We identify regimes in
which the performance measures are quite sensitive to such
measures and others in which the measures have very limited
impact. In particular, we present two counteractions against
unauthorized file sharing in the presence of illegal publishers.
Our methodology can be extended to analyze file bundling, which
serves as a positive action for file dissemination.